Sub-quadratic Markov tree mixture learning based on randomizations of the Chow-Liu algorithm
Abstract
The present work analyzes different randomized methods for learning Markov tree mixtures for density estimation in very high-dimensional discrete spaces (a very large number n of discrete variables) when the sample size N is very small compared to n. Several sub-quadratic relaxations of the Chow-Liu algorithm are proposed, each weakening its search procedure. We first study naïve randomizations and then gradually increase the deterministic behavior of the algorithms by focusing on the most interesting edges, either by retaining the best edges across models or by inferring promising relationships between variables. We compare these methods to totally random tree generation and to randomization based on bootstrap resampling (bagging), of linear and quadratic complexity respectively. Our results show that randomization becomes increasingly attractive as the N/n ratio shrinks, and that methods which simultaneously discover and exploit the problem structure are promising in this context.
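As a concrete illustration of the kind of relaxation discussed above, the sketch below scores only a randomly subsampled set of candidate edges instead of all n(n-1)/2 pairs before running the usual maximum weight spanning tree step. This is a minimal Python sketch under our own assumptions (the function names, the plug-in mutual information estimator, and the uniform edge-sampling scheme are illustrative), not the exact procedure evaluated in the paper.

```python
import random

import numpy as np


def mutual_information(x, y):
    """Plug-in estimate of I(X;Y) for two 1-D arrays of small non-negative ints."""
    joint = np.zeros((x.max() + 1, y.max() + 1))
    np.add.at(joint, (x, y), 1.0)          # contingency table of co-occurrences
    joint /= joint.sum()
    px = joint.sum(axis=1, keepdims=True)  # marginal of X
    py = joint.sum(axis=0, keepdims=True)  # marginal of Y
    nz = joint > 0
    return float((joint[nz] * np.log(joint[nz] / (px @ py)[nz])).sum())


def randomized_chow_liu(data, n_candidate_edges, rng):
    """Chow-Liu with a randomly subsampled candidate edge set (illustrative).

    data: (N, n) integer array of discrete observations.
    Scoring only n_candidate_edges random pairs instead of all n*(n-1)/2
    makes the edge-scoring step sub-quadratic in n; the greedy step may
    then return a forest rather than a spanning tree.
    """
    n = data.shape[1]
    n_candidate_edges = min(n_candidate_edges, n * (n - 1) // 2)
    candidates = set()
    while len(candidates) < n_candidate_edges:
        i, j = rng.randrange(n), rng.randrange(n)
        if i != j:
            candidates.add((min(i, j), max(i, j)))
    scored = sorted(
        ((mutual_information(data[:, i], data[:, j]), i, j) for i, j in candidates),
        reverse=True,
    )
    # Kruskal's greedy step: keep the highest-MI edges that do not close a cycle.
    parent = list(range(n))

    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]  # path compression
            u = parent[u]
        return u

    edges = []
    for w, i, j in scored:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            edges.append((i, j, w))
    return edges
```

Repeating this procedure with independent edge subsets and averaging the resulting tree densities is one way to obtain a mixture; choosing, say, n_candidate_edges proportional to n keeps the whole scoring pass linear in the number of variables.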
Similar references
Towards sub-quadratic learning of probability density models in the form of mixtures of trees
We consider randomization schemes for the Chow-Liu algorithm, ranging from weak ones (bagging, of quadratic complexity) to strong ones (full random sampling, of linear complexity), for learning probability density models in the form of mixtures of Markov trees. Our empirical study on high-dimensional synthetic problems shows that, while bagging is the most accurate scheme on average, some of the stronger rand...
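The "full random sampling" end of that spectrum can be made concrete by drawing a uniformly random labeled tree, obtainable in near-linear time by decoding a random Prüfer sequence. The sketch below is an illustrative assumption about how such structures could be sampled; fitting each tree's parameters from data is not shown.

```python
import heapq
import random


def random_spanning_tree(n, rng):
    """Sample a uniformly random labeled tree on n nodes by decoding a
    random Pruefer sequence; O(n log n) with a heap, effectively linear."""
    if n < 2:
        return []
    seq = [rng.randrange(n) for _ in range(n - 2)]
    degree = [1] * n
    for v in seq:
        degree[v] += 1
    leaves = [v for v in range(n) if degree[v] == 1]
    heapq.heapify(leaves)
    edges = []
    for v in seq:
        leaf = heapq.heappop(leaves)   # smallest current leaf
        edges.append((leaf, v))
        degree[v] -= 1
        if degree[v] == 1:             # v became a leaf
            heapq.heappush(leaves, v)
    # Exactly two leaves remain; join them to complete the tree.
    edges.append((heapq.heappop(leaves), heapq.heappop(leaves)))
    return edges
```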
Probability Density Estimation by Perturbing and Combining Tree Structured Markov Networks
To explore the Perturb and Combine idea for estimating probability densities, we study mixtures of tree-structured Markov networks derived by bagging combined with the Chow-Liu maximum weight spanning tree algorithm, or by pure random sampling. We empirically assess the performance of these methods in terms of accuracy, with respect to mixture models derived by EM-based learning of Naive B...
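A hedged sketch of the bagging variant: each mixture term is a Chow-Liu tree learned on a bootstrap replicate, and the terms are averaged uniformly. The chow_liu callable and its .log_prob interface are hypothetical placeholders, not an API from the paper; rng is assumed to be a NumPy Generator.

```python
import numpy as np


def bagged_tree_mixture(data, n_trees, chow_liu, rng):
    """Perturb-and-Combine sketch: one Chow-Liu tree per bootstrap
    replicate, combined as a uniform mixture of densities.

    chow_liu: callable mapping an (N, n) data array to a fitted tree
    model exposing .log_prob(x); assumed available, not defined here.
    """
    N = data.shape[0]
    trees = []
    for _ in range(n_trees):
        idx = rng.integers(0, N, size=N)   # bootstrap resample with replacement
        trees.append(chow_liu(data[idx]))

    def log_prob(x):
        # Uniform mixture: average the component densities in log space.
        logs = np.array([t.log_prob(x) for t in trees])
        return np.logaddexp.reduce(logs) - np.log(n_trees)

    return log_prob
```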
Mixtures of Bagged Markov Tree Ensembles
Key points: trees → efficient algorithms; mixtures → improved modeling. There are two approaches to improve over a single Chow-Liu tree. The first is bias reduction, e.g. the EM algorithm [1]: learning the mixture is viewed as a global optimization problem aiming at maximizing the data likelihood; there is a bias-variance trade-off associated with the number of terms; and it leads to a partition of the learning s...
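For the EM-based, bias-reduction view mentioned above, the following sketch shows only the weight-update part of EM for a mixture whose component densities are held fixed; full EM for tree mixtures would also re-learn each tree from responsibility-weighted data in the M-step. The interface is an assumption for illustration.

```python
import numpy as np


def em_mixture_weights(log_liks, n_iter=100):
    """EM for the weights of a mixture with fixed components.

    log_liks: (m, N) array holding log p_k(x_i) for component k, sample i.
    Returns the mixture weights maximizing the data likelihood over weights.
    """
    m, N = log_liks.shape
    w = np.full(m, 1.0 / m)
    for _ in range(n_iter):
        # E-step: log posterior responsibility of component k for sample i.
        log_post = np.log(w)[:, None] + log_liks
        log_post = log_post - np.logaddexp.reduce(log_post, axis=0)
        # M-step: each weight is the average responsibility of its component.
        w = np.exp(log_post).mean(axis=1)
    return w
```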
Conditional Chow-Liu Tree Structures for Modeling Discrete-Valued Vector Time Series
We consider the problem of modeling discrete-valued vector time series data using extensions of Chow-Liu tree models that capture both dependencies across time and dependencies across variables. We introduce conditional Chow-Liu tree models, an extension of standard Chow-Liu trees, for modeling conditional rather than joint densities. We describe learning algorithms for such models and show how t...
A Generalization of the Chow-Liu Algorithm and its Application to Statistical Learning (http://arxiv.org/abs/1002.2240)
Learning statistical knowledge from data requires substantial computation, so one eventually compromises between the accuracy and the time complexity of the learning algorithm by settling for an approximation to the best solution. In this paper, we address how to efficiently estimate the dependency relations among attribute values by constructing an undirected graph (a Markov network) via the Chow-Liu algorithm...
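One way such a generalization can trade accuracy against complexity is to penalize the empirical mutual information by an MDL/BIC-style term, so that weak edges receive a negative score and are dropped, turning the spanning tree into a forest. The score below is an illustrative assumption, not necessarily the exact criterion of the cited paper.

```python
import numpy as np


def penalized_mi(x, y, n_levels_x, n_levels_y):
    """Empirical mutual information minus a BIC/MDL-style complexity
    penalty; an illustrative edge score, not the paper's exact criterion.

    A non-positive score suggests the edge is not worth including.
    """
    N = len(x)
    joint = np.zeros((n_levels_x, n_levels_y))
    np.add.at(joint, (x, y), 1.0)          # contingency table
    joint /= N
    px = joint.sum(axis=1, keepdims=True)
    py = joint.sum(axis=0, keepdims=True)
    nz = joint > 0
    mi = (joint[nz] * np.log(joint[nz] / (px @ py)[nz])).sum()
    # Penalty grows with the number of free parameters of the pairwise term.
    penalty = (n_levels_x - 1) * (n_levels_y - 1) * np.log(N) / (2 * N)
    return mi - penalty
```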